-
Abstract
Background: In the CRISPR-Cas9 system, the efficiency of genetic modifications has been found to vary depending on the single guide RNA (sgRNA) used. A variety of sgRNA properties have been found to be predictive of CRISPR cleavage efficiency, including the position-specific sequence composition of sgRNAs, global sgRNA sequence properties, and thermodynamic features. While existing deep learning-based approaches provide competitive prediction accuracy, a more interpretable model is desirable to help understand how different features may contribute to CRISPR-Cas9 cleavage efficiency.
Results: We propose a gradient boosting approach, utilizing LightGBM to develop an integrated tool, BoostMEC (Boosting Model for Efficient CRISPR), for the prediction of wild-type CRISPR-Cas9 editing efficiency. We benchmark BoostMEC against 10 popular models on 13 external datasets and show its competitive performance.
Conclusions: BoostMEC can provide state-of-the-art predictions of CRISPR-Cas9 cleavage efficiency for sgRNA design and selection. Because it relies on direct and derived features of sgRNA sequences and on conventional machine learning, BoostMEC maintains an advantage over other state-of-the-art CRISPR efficiency prediction models based on deep learning: it produces more interpretable feature insights and predictions.
-
Abstract Ribosome profiling, also known as Ribo-seq, has become a popular approach to investigate regulatory mechanisms of translation in a wide variety of biological contexts. Ribo-seq not only provides a measurement of translation efficiency based on the relative abundance of ribosomes bound to transcripts, but also has the capacity to reveal dynamic and local regulation at different stages of translation based on positional information of footprints across individual transcripts. While many computational tools exist for the analysis of Ribo-seq data, no method is currently available for rigorous testing of pattern differences in ribosome footprints. In this work, we develop a novel approach together with an R package, RiboDiPA, for Differential Pattern Analysis of Ribo-seq data. RiboDiPA allows for quick identification of genes with statistically significant differences in ribosome occupancy patterns for model organisms ranging from yeast to mammals. We show that differential pattern analysis reveals information that is distinct from and complementary to existing methods that focus on translational efficiency analysis. Using both simulated Ribo-seq footprint data and three benchmark data sets, we illustrate that RiboDiPA can uncover meaningful pattern differences across multiple biological conditions on a global scale, and pinpoint characteristic ribosome occupancy patterns at single-codon resolution.
-
Abstract DNA mechanical properties play a critical role in every aspect of DNA-dependent biological processes. Recently, a high-throughput assay named loop-seq has been developed to quantify the intrinsic bendability of a massive number of DNA fragments simultaneously. Using the loop-seq data, we develop a software tool, DNAcycP, based on a deep-learning approach for intrinsic DNA cyclizability prediction. We demonstrate that DNAcycP predicts intrinsic DNA cyclizability with high fidelity compared to the experimental data. Using an independent dataset from in vitro selection for enrichment of loopable sequences, we further verified that the predicted cyclizability score, termed the C-score, distinguishes well between DNA fragments with different loopability. We applied DNAcycP to multiple species and compared the C-scores with available high-resolution chemical nucleosome maps. Our analyses showed that both yeast and mouse genomes share a conserved feature of high DNA bendability spanning nucleosome dyads. Additionally, we extended our analysis to transcription factor binding sites and, surprisingly, found that cyclizability is substantially elevated at CTCF binding sites in the mouse genome. We further demonstrate that this distinct mechanical property is conserved across mammalian species and is inherent to the CTCF-binding DNA motif.
